Speaker conversion in ARX-based source-

نویسنده

  • Hiroki Mori
چکیده

A speaker conversion framework for formant synthesis is proposed. With this framework, given a small set of a target speaker’s utterances, segmental features of an original speech can be converted to those of the given speaker. Unlike other speaker conversion frameworks, further voice quality modification can also be applied to the converted speech with conventional formant modification techniques. The parameter conversion is based on MLLR in the cepstral domain. The effect of parameter conversion can be seen from the graphical representation of formant placement. The results of an auditory experiment showed that most of the converted speech was perceived as being similar to that of target speakers.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

طراحی یک روش آموزش ناموازی جدید برای تبدیل گفتار با عملکردی بهتر از آموزش موازی

Introduction: The art of voice mimicking by computers, has with the computer have been one of the most challenging topics of speech processing in recent years. The system of voice conversion has two sides. In one side, the speaker is the source that his or her voice has been changed for mimicking the target speaker’s voice (which is on the other side). Two methods of p...

متن کامل

Voice Conversion Based on Speaker-Dependent Restricted Boltzmann Machines

This paper presents a voice conversion technique using speaker-dependent Restricted Boltzmann Machines (RBM) to build highorder eigen spaces of source/target speakers, where it is easier to convert the source speech to the target speech than in the traditional cepstrum space. We build a deep conversion architecture that concatenates the two speakerdependent RBMs with neural networks, expecting ...

متن کامل

Cross-language voice conversion based on eigenvoices

This paper presents a novel cross-language voice conversion (VC) method based on eigenvoice conversion (EVC). Crosslanguage VC is a technique for converting voice quality between two speakers uttering different languages each other. In general, parallel data consisting of utterance pairs of those two speakers are not available. To deal with this problem, we apply EVC to cross-language VC. First...

متن کامل

Sequence error (SE) minimization training of neural network for voice conversion

Neural network (NN) based voice conversion, which employs a nonlinear function to map the features from a source to a target speaker, has been shown to outperform GMM-based voice conversion approach [4-7]. However, there are still limitations to be overcome in NN-based voice conversion, e.g. NN is trained on a Frame Error (FE) minimization criterion and the corresponding weights are adjusted to...

متن کامل

To Investigate the Accuracy of the Dynamic Time Warping Based Transformation Function for Voice Conversion

Voice conversion involves transformation of speaker characteristics in a speech uttered by a speaker called source speaker so as to generate a speech having voice characteristics of a desired speaker called target speaker. Voice conversion technology is used in many applications namely dubbing, to enhance the quality of the speech, text-to-speech synthesizers, online games, multimedia, music, c...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003